Quantitatively assessing early detection strategies for mitigating COVID-19 and future pandemics

Researchers and policymakers have proposed systems to detect novel pathogens earlier than existing surveillance systems by monitoring samples from hospital patients, wastewater, and air travel, in order to mitigate future pandemics. How much benefit would such systems offer? We developed, empirically validated, and mathematically characterized a quantitative model that simulates disease spread and detection time for any given disease and detection system. We find that hospital monitoring could have detected COVID-19 in Wuhan 0.4 weeks earlier than it was actually discovered, at 2,300 cases (standard error: 76 cases) compared to 3,400 (standard error: 161 cases). Wastewater monitoring would not have accelerated COVID-19 detection in Wuhan, but provides benefit in smaller catchments and for asymptomatic or long-incubation diseases like polio or HIV/AIDS. Air travel monitoring does not accelerate outbreak detection in most scenarios we evaluated. In sum, early detection systems can substantially mitigate some future pandemics, but would not have changed the course of COVID-19.

To test the importance of early response, we asked whether countries that started COVID-19 lockdown earlier were better able to achieve low case counts post-lockdown (Fig. S1). We gathered (1) country lockdown dates from media reports (table S1) and combined these with (2) country-level COVID-19 confirmed case counts 74 to estimate the number of infectious cases in each country at the time of lockdown. We gathered complete data for 85 countries and analyzed only these countries. In 2020, countries that instituted lockdown before reaching 1,000 infectious cases were much more likely to contain COVID-19 initially, and this earliness of lockdown was more predictive of lockdown success than lockdown duration (Fig. S1). A lockdown was deemed successful if the average number of daily cases in the 7 days following lockdown was fewer than 10. Of the 85 countries, 68 started lockdown before 1,000 infectious cases. 38% (26/68) of these countries with earlier lockdowns contained COVID-19 initially, compared to 0% (0/17) of countries with later lockdowns (a statistically significant difference at p = 0.0057). This result is robust across many thresholds and definitions of lockdown success (Fig. S3) and caseload-based definitions of lockdown earliness (Fig. S4). Earliness of lockdown measured by caseload was also more predictive of lockdown success than earliness measured by the raw lockdown start date (Fig. S2).
This analysis has limitations. First, we do not account for significant variation among countries in the extent of COVID-19 testing, the number of imported cases (approximated by the amount of travel), demographics and age structure, country size and density, and other factors; all of these can affect measurements of lockdown success. Countries which did not test extensively may be recorded as having low cases post-lockdown and as having started lockdown early, which could partially explain the observed association between lockdown earliness and success. However, this does not seem to explain most of the relationship: countries like Thailand and New Zealand, which tested extensively per capita 75, were among the countries with early, successful lockdowns. Second, we cannot definitively distinguish correlation from causation: the association between earlier lockdown and fewer cases post-lockdown may arise because countries which were willing to implement lockdown earlier were also more willing to implement and comply with more stringent lockdowns. Nevertheless, the observed association between lockdown earliness and success is consistent with the hypothesis that lockdown earliness improves chances of success.
We used (1) and (2) to calculate (a) the number of infectious cases in each country at the time of lockdown (as a measure of the earliness of lockdown) and (b) the number of cases in each country following lockdown (as a measure of the lockdown's success). For (1), we only analyzed countries' first lockdowns and required these lockdowns to start before 2021-01-01 and last longer than 7 days. When a country had regional lockdowns which differed from the national lockdown, we used the start and end dates of the national lockdown. If the country had no national lockdown and only regional lockdowns, we used the dates of the regional lockdown with the median start date.
For (a), we used overall case counts to estimate the infectious cases at the lockdown start date by assuming (i) a 5-day infectious period 47 and therefore (ii) that the infectious cases on day T are 1/5 * (cases on day T-4) + 2/5 * (cases on day T-3) + … + 5/5 * (cases on day T). We also performed a sensitivity analysis using the raw case count average at the start of lockdown instead of the infectious case count (Fig. S4). For (b), we calculated the average daily new cases for the 7 days after the lockdown was lifted, and we considered a lockdown successful if this average was fewer than a threshold of 10 cases. We performed a sensitivity analysis for thresholds of 3, 10, 30 and 100 cases (Fig. S3).
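As an illustration of measures (a) and (b), the following minimal sketch computes the weighted infectious case count and the success criterion from a list of daily case counts. It assumes the input is a series of daily new confirmed cases indexed by day; the function names, argument names, and error handling are hypothetical and are not taken from the study's code.

```python
from typing import Sequence

INFECTIOUS_PERIOD_DAYS = 5  # assumption (i) in the text: 5-day infectious period


def infectious_cases(daily_cases: Sequence[float], day: int) -> float:
    """Weighted sum 1/5*c[T-4] + 2/5*c[T-3] + ... + 5/5*c[T] for T = day."""
    n = INFECTIOUS_PERIOD_DAYS
    if day < n - 1 or day >= len(daily_cases):
        raise ValueError("need a full 5-day window ending on `day`")
    window = daily_cases[day - n + 1 : day + 1]  # days T-4 .. T
    # Weight k/5 for the k-th day in the window (oldest day gets 1/5).
    return sum(k * c for k, c in enumerate(window, start=1)) / n


def lockdown_successful(
    daily_cases: Sequence[float],
    lockdown_end: int,
    threshold: float = 10.0,
    window_days: int = 7,
) -> bool:
    """True if mean daily cases in the 7 days after lockdown is below threshold."""
    after = daily_cases[lockdown_end + 1 : lockdown_end + 1 + window_days]
    if len(after) < window_days:
        raise ValueError("not enough post-lockdown data")
    return sum(after) / window_days < threshold
```

The `threshold` argument can be set to 3, 10, 30 or 100 to mirror the sensitivity analysis described above.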
To calculate statistical significance of the different lockdown success rates between countries with earlier and later lockdowns, we used the 2-sample test for equality of proportions implemented in R's prop.test.
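For readers without R, the comparison can be approximated in Python. The sketch below is a hand-written two-sample test for equality of proportions with Yates' continuity correction, which we assume corresponds to the default behaviour of prop.test for two groups; the function name and signature are hypothetical. With the counts reported above (26/68 versus 0/17), it yields a p-value close to the reported 0.0057.

```python
import math
from typing import Tuple


def two_proportion_test(
    x1: int, n1: int, x2: int, n2: int, correct: bool = True
) -> Tuple[float, float]:
    """Chi-square test (1 df) that two proportions x1/n1 and x2/n2 are equal.

    Returns (chi-square statistic, two-sided p-value).
    """
    # 2x2 table: successes and failures in each group.
    a, b = x1, n1 - x1
    c, d = x2, n2 - x2
    n = n1 + n2
    denom = n1 * n2 * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero; test is undefined")
    diff = abs(a * d - b * c)
    if correct:
        # Yates' continuity correction (assumed to match prop.test's default).
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff**2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = two_proportion_test(26, 68, 0, 17)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```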

Validation of model in US states.
In validating our model in US state data, we were able to predict US state detection times with a mean absolute error of 0.97 weeks (Fig. S6). We achieved this with a relatively simple validation setup: R0 was the only parameter we allowed to vary among states, which did not allow the model to account for differing state testing turnaround and capacity or inter-state variation in growth rate of imported cases. Gathering and accounting for those data could improve the model's accuracy. (Other inter-state variables like differing age structures and demographics, as well as lockdown policies and pandemic-induced mobility changes, should be accounted for in the state-specific R0 values.)

Derivation for mathematical approximation of cases until detection
As an intuitive summary of the derivation, we break down the number of cases until detection into two components: (i) the cases that occur until the infection of the "threshold case" (the final case needed to trigger detection), and (ii) the cases that occur afterwards during the delay between the threshold case's infection and detection. In the formula, $d/p_{test}$ corresponds to (i), and each of those $(R_0 - 1)/R_0 \cdot d/p_{test}$ cases in the last generation spawns an outbreak process proportional to $\sum_{i=1}^{G} R_0^i$ cases, where $G$ is the number of generations that occur during the delay, corresponding to (ii).
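The two-part breakdown above can be sketched numerically. The snippet below is a minimal illustration and not the study's simulation code. It assumes a detection threshold d, a per-case testing probability p_test, a basic reproduction number R0 greater than 1, and a whole number of generations elapsing during the detection delay; all function and argument names are hypothetical. Because the geometric sum satisfies (R0 - 1)/R0 * (R0 + R0^2 + ... + R0^G) = R0^G - 1, the two parts together reduce to d/p_test * R0^G, which the second function computes directly.

```python
def approx_mean_cases_at_detection(
    d: float, p_test: float, r0: float, generations: int
) -> float:
    """Approximate mean cumulative cases at detection: part (i) + part (ii)."""
    if not 0 < p_test <= 1:
        raise ValueError("p_test must be in (0, 1]")
    if r0 <= 1:
        raise ValueError("this sketch assumes a growing outbreak (R0 > 1)")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    # (i) expected cases until the threshold case is infected.
    threshold_cases = d / p_test
    # Approximate size of the last generation at that moment.
    last_generation = (r0 - 1) / r0 * threshold_cases
    # (ii) cases produced during the delay by that last generation.
    delay_cases = last_generation * sum(r0**i for i in range(1, generations + 1))
    return threshold_cases + delay_cases


def closed_form(d: float, p_test: float, r0: float, generations: int) -> float:
    """Equivalent closed form after collapsing the geometric sum."""
    return d / p_test * r0**generations
```

For example, with d = 10, p_test = 0.5, R0 = 2 and 3 generations of delay, part (i) contributes 20 cases and part (ii) contributes 140, for a total of 160.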
Full derivation: Assume the outbreak starts in a community covered by the detection system. We want the mean and variance of the cumulative number of cases $C$ which have occurred by the time the detection system is triggered. The outbreak occurs in generations, where the index case is generation 0 and each generation of cases creates the next generation of cases. We can express $C$ as follows:

$$C = T + \sum_{i=1}^{G} X_i$$

where $T$ is the number of cases infected until the threshold case is infected, $X_i$ is the number of infectious cases in the $i$-th generation after the threshold case is infected, and $G$ is the number of generations which occur in the delay between the threshold case's infection and detection. Note first that

$$T \sim NBinom(d, p_{test})$$

where $d$ is the detection threshold (the number of cases which need to be detected to constitute an outbreak) and $p_{test}$ is the probability any particular case is tested. For example, in the hospital system, $p_{test}$ = ℎ_.

For the mean of $C$, we will need:

$$E[T] = \frac{d}{p_{test}}, \qquad E[X_i] \approx \frac{R_0 - 1}{R_0} \, E[T] \cdot E[\xi]^i = \frac{R_0 - 1}{R_0} \cdot \frac{d}{p_{test}} \cdot R_0^i$$

The first expansion of $E[X_i]$ derives from two facts: (a) $X_i$ is the sum of approximately $\frac{R_0 - 1}{R_0} T$ independent and identically distributed branching processes, so that the mean of $X_i$ is $\frac{R_0 - 1}{R_0} E[T]$ times the mean of one branching process. (b) From branching process mathematics, the mean of $Z_i$, the number of entities in the $i$-th generation of a branching process, is $E[\xi]^i$ 76, where $\xi$ is the offspring distribution (the distribution of the number of secondary cases infected by each primary case). In this study, $\xi$ is negative binomial with mean $R_0$ and dispersion 0.01 52. Thus,

$$E[C] \approx \frac{d}{p_{test}} + \frac{R_0 - 1}{R_0} \cdot \frac{d}{p_{test}} \sum_{i=1}^{G} R_0^i$$

Table S2. Threshold, delay, and probability for 3 proposed early detection systems.
1 Detection threshold: The government or hospital implementing the system chooses the detection threshold they consider to be sufficient. For COVID-19, Wuhan hospitals were willing to report the "extraordinary" situation to local health authorities after seven known cases 69. During the 2002-2004 SARS-CoV-1 outbreak, hospital officials became alarmed after one patient and eight doctors and nurses became sick 71. Thus we choose a detection threshold of ten.
2 See Materials and methods for details on setting the detection threshold. Detection delay: 63,77. Detection probability: Fecal shedding tends to constitute more of the human pathogen nucleic acid in wastewater than urine, saliva, or other specimens, due to higher rates of shedding and higher pathogen loads in feces 63,78. The fraction of people connected to central sewage in Wuhan is estimated at 80% based on a 2016 Asian Development Bank appraisal stating that Wuhan aimed to treat this fraction of wastewater in 2010 79; this fraction is similar to the fraction of US households connected to public sewers (83%) 64.
Table S3. Epidemiological parameters of outbreaks studied.
2 48,82. Due to the lack of infection hospitalization rates at this time, we infer the infection hospitalization rate to be 0.03 by halving the estimated case hospitalization rates of 0.06-0.07 for the 2022 mpox outbreak 83,84. We choose half because a majority of mpox infections are symptomatic 85 and some fraction of those will seek medical care and get tested. Time to hospitalization is estimated by adding the incubation period of 7 days to the median time from symptom onset to hospitalization (7 days) 86. We and others are unable to find estimates of mpox fecal shedding rates 87, but mpox has been detectable in wastewater during the 2022 outbreak 88, so we assign a value of 0.5, in line with SARS-CoV-2 and flu, but on the higher end because mpox causes symptoms more broadly than in just the respiratory system.
3 Due to lack of data and estimates of R0 for polio in 2022, we use an R0 of 1.6 from the Israel 2013-2014 wild poliovirus type 1 outbreak 60 to represent a polio outbreak in a population with sanitation systems and high levels of vaccination coverage 89,90. Hospitalization rate is inferred from the fact that roughly 1% or fewer of polio infections result in flaccid paralysis 91. Serial interval is estimated as the latent period plus one half of the infectious period 92: in the Israel outbreak, this was estimated as (latent period) + 1/2 * (infectious period) = 4 + 1/2 * 1/0.93 ≈ 4.5 days (Table 2 in 60). Hospitalization time is inferred from the several-day period of minor illness, symptom-free period of 1-3 days, and then onset of paralysis within 2-3 days 91. Probability of fecal shedding was inferred from literature estimates in enteroviruses 93 and in vaccinated children 94.
4 95,96. The time to hospitalization is estimated as the sum of the incubation period (9-12 days 97) and the time from symptom onset to hospital admission (5.7 days 98). We and others are unable to find precise estimates of Ebola fecal shedding rates, but Ebola has commonly been detected in stool when measured 78, so we assign a value of 0.5, in line with SARS-CoV-2 and flu, but on the higher end because Ebola causes symptoms more broadly than in just the respiratory system.
5 99,100. The hospitalization rate was estimated by multiplying the symptomatic hospitalization rate of 0.0144 (the proportion of symptomatic cases requiring hospitalization) 101 by the symptomatic rate of 0.8 (the proportion of all cases who were symptomatic) 102. The hospitalization time was estimated as the sum of the incubation period (1.4 days 103) and the time from symptom onset to hospital admission (2 days 104).
Probability of fecal shedding is calculated using estimates that 60% of HIV-positive patients show gastrointestinal symptoms 109 and that 5/9 and 1/10 of HIV-positive patients with and without gastrointestinal symptoms, respectively, test positive in fecal samples for HIV nucleic acid 110.
7 These parameters are very loosely inspired by the parameters for long-incubation diseases like tuberculosis (assuming cases are untreated) [111][112][113]. Time to active disease is used as a proxy for time to hospitalization. The serial interval is estimated by taking estimates from the antibiotic era and subtracting 12 months to account for 12 months of antibiotic treatment. This is consistent with the observed pre-antibiotic era incubation period of at least 1-1.5 months (assuming transmission starts approximately when symptoms appear), because the serial interval is the latent period plus half the infectious period. Reproductive number is selected from the higher end of 113 because most of the studies in that review are from the antibiotic era.
Detection systems are generally modeled assuming that their implementation follows the details in various proposals.
In hospital monitoring, hospitals would test for high-priority pathogen families (e.g. coronaviruses) in patients presenting with severe infectious symptoms in hospital emergency departments 19. Similarly, in wastewater monitoring, governments would test for pathogens in city wastewater treatment plants daily, and monitor for high and increasing levels of high-priority pathogen families 21. In air travel monitoring, we model testing of individual symptomatic passengers on incoming international flights for the same pathogens (this differs from proposals to monitor airplane sewage 22 or bridge air).
Status quo detection is modeled as a partially implemented form of hospital monitoring (lower detection probability per case $p_{test}$).

Fig. S4. Earliness of lockdown (x-axis) versus lockdown length in days (y-axis) and lockdown success (first lockdown unsuccessful (orange) and first lockdown successful (teal)) for 85 countries, analogous to fig. S1, except that earliness of lockdown is measured here in terms of all cases (rather than infectious cases) at lockdown.

Fig. S5. Distribution of wastewater sensitivity estimated from reported COVID-19 incidences and wastewater sample data from 47 sampling locations from 53.

Fig. S18. Detection times estimated by main model versus Monte Carlo-based model with reproduction number 55. The left panel is our original Fig. 2A using the original model; the right panel shows the same detection times from the more complex model. Each boxplot shows 100 simulations (points).
Durations and start and end dates of lockdowns in 85 countries.
Fig. S3. Earliness of lockdown (x-axis) versus lockdown length in days (y-axis) and lockdown success (first lockdown unsuccessful (orange) and first lockdown successful (teal)) for 85 countries, analogous to fig. S1, for 4 different thresholds of lockdown success (thresholds shown in gray labels). A lockdown is successful if the average number of daily cases following lockdown is less than the threshold for 7 days.
COVID-19 cases leading up to COVID-19 detection in 50 US states, which are used as the x-axis in fig. S6. Y-values here are literature estimates of total (tested plus untested) cases 54. We extrapolated these cases based on exponential fit back to January 1, 2020. Dashed lines mark the date of the first detected case in the state, and the shaded areas under the curve denote the cumulative number of cases until detection.
Fig. S9. Comparison of simulation model of cases until detection versus mathematical approximation (hospital (teal), wastewater (orange) and air travel (purple)) in a 650,000-person catchment. Solid lines are the means of simulated case counts; dashed lines are the approximated means based on the derived formula for cases at detection. Each column shows 100 simulations (points).

Table S4. Date of first reported COVID-19 case in each of 50 US states.
Dates are sourced from media reports and state public health agency press releases. An index case is considered to be caught unusually early if caught earlier than 4 days after symptom onset.

Table S5. Assumptions in detection time model.
This assumes detection systems are implemented broadly as proposed in 19,21. It is unlikely that detection systems would be implemented in 100% of communities, but we assume coverage in at least the community of origin to show the benefits if such systems are fully funded and implemented. This can be relaxed with a corresponding increase in average detection time. Such multiplex testing may not catch completely novel pathogens, but this approach is applicable to most recent emerging pathogens such as SARS-CoV-2 (2019), Ebola (2013), MERS-CoV (2012), and pandemic flu (2009). Proposed technologies include multiplex PCR [41][42][43][44], CRISPR-based multiplex diagnostics 45, and metagenomic sequencing 46. Novel pathogens from multiplex testing can be distinguished from known pathogens by sequencing, but one can also apply the model to calculate detection times of new outbreaks of known pathogens.